Multi-Document Arabic Summarization Using Text Clustering to Reduce Redundancy
Abstract
“Multi-document summarization is the process of producing a single summary from a collection of related documents. In this work we focus on generic extractive Arabic multi-document summarizers and describe a cluster-based approach to multi-document summarization. The main problem in multi-document text summarization is sentence redundancy, which must be eliminated to ensure coherence and improve readability. Our main objective is therefore to examine how salient information can be selected for the Arabic text summarization task when the input contains noisy and redundant information. The final summaries for ten categories of related documents are evaluated using Recall and Precision; the best Recall achieved is 0.6 and the best Precision is 0.6.”
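As a rough illustration of the idea in the abstract, the following minimal sketch groups near-duplicate sentences, keeps one representative per group, and scores the result with sentence-level Precision and Recall. It is not the authors' method: the whitespace tokenizer, the Jaccard similarity, the greedy single-pass clustering, the 0.5 threshold, and all function names are assumptions made for this example. Real Arabic text would normally also need normalization and stemming before comparison.

```python
def tokens(sentence):
    # Assumption: naive lowercase whitespace tokenization, no Arabic-specific normalization.
    return set(sentence.lower().split())


def jaccard(a, b):
    # Jaccard similarity between two token sets (0.0 when both are empty).
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_sentences(sentences, threshold=0.5):
    # Greedy single-pass clustering: a sentence joins the first cluster whose
    # representative (its first member) is similar enough, else it starts a new cluster.
    clusters = []
    for s in sentences:
        for c in clusters:
            if jaccard(tokens(s), tokens(c[0])) >= threshold:
                c.append(s)
                break
        else:
            clusters.append([s])
    return clusters


def summarize(sentences, threshold=0.5):
    # Keep one sentence (the representative) per cluster to reduce redundancy.
    return [c[0] for c in cluster_sentences(sentences, threshold)]


def precision_recall(selected, reference):
    # Sentence-level Precision and Recall of a system summary against a
    # hypothetical human reference summary, by exact sentence match.
    selected, reference = set(selected), set(reference)
    hits = len(selected & reference)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


# Toy input: the first two sentences are near duplicates.
docs = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "stocks fell sharply today",
]
summary = summarize(docs)
reference = ["the cat sat on the mat", "rain is expected"]
precision, recall = precision_recall(summary, reference)
```

The threshold controls how aggressively sentences are merged: a lower value removes more redundancy but risks dropping sentences that carry distinct information.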
Related resources
A Proposed Textual Graph Based Model for Arabic Multi-document Summarization
Text summarization task is still an active area of research in natural language preprocessing. Several methods that have been proposed in the literature to solve this task have presented mixed success. However, such methods developed in a multi-document Arabic text summarization are based on extractive summary and none of them is oriented to abstractive summary. This is due to the challenges of...
User-Focused Multi-Document Summarization with Paragraph Clustering and Sentence-Type Filtering
Applying document clustering techniques to multidocument summarization is a challenging problem, mostly because of the redundancy that exists in multiple sources. We compare several document clustering techniques for multi-document summarization in the NTCIR-4 TSC test collection. We conducted an experiment to evaluate the effectiveness of reducing redundancy in the production of summaries. Fro...
A multi-document summarization system based on statistics and linguistic treatment
The massive quantity of data available today in the Internet has reached such a huge volume that it has become humanly unfeasible to efficiently sieve useful information from it. One solution to this problem is offered by using text summarization techniques. Text summarization, the process of automatically creating a shorter version of one or more text documents, is an important way of finding ...
Automatic Multi-Document Arabic Text Summarization Using Clustering and Keyphrase Extraction
Automatic text summarization has become important due to the rapid growth of information texts since it is very difficult for human beings to manually summarize large documents of texts. A full understanding of the document is essential to form an ideal summary. However, achieving full understanding is either difficult or impossible for computers. Therefore, selecting important sentences from t...
Using a Double Clustering Approach to Build Extractive Multi-document Summaries
This paper presents a method for extractive multi-document summarization that explores a two-phase clustering approach. First, sentences are clustered by similarity, and one sentence per cluster is selected, to reduce redundancy. Then, in order to group them according to topics, those sentences are clustered considering the collection of keywords that represent the topics in the set of texts. E...